00. Create/Clone a github repository

01. Create RStudio Project

02. Setup RStudio

03. Download .CSV Files

The three datasets that will be used are:

The original and reformatted CSV’s can be found in 01 Data folder.

04. ETL & Create Table on Database

CREATE TABLE Refugee_Stats ( Asylum_Country varchar2(4000), Origin_Country varchar2(4000), Record_Year number(38,4), Refugees number(38,4), Asylum_seekers number(38,4), Returned_Refugees number(38,4), IDPs number(38,4), Returned_IDPs number(38,4), Stateless_persons number(38,4), Others_of_concern number(38,4), Total_Population number(38,4) );

TABLE “MILITARIZED_DISPUTES”

STATE_ABBR VARCHAR2(4000 BYTE) Yes 1
DISPNUM3 NUMBER(38,4) Yes 2
DISPNUM4 NUMBER(38,4) Yes 3
COUNTRY_CODE NUMBER(38,4) Yes 4
START_DAY NUMBER(38,4) Yes 5
START_MON NUMBER(38,4) Yes 6
START_YEAR NUMBER(38,4) Yes 7
END_DAY NUMBER(38,4) Yes 8
END_MON NUMBER(38,4) Yes 9
END_YEAR NUMBER(38,4) Yes 10
SIDE_A_BOOL NUMBER(1,0) Yes 11
REVISIONIST_STATE_BOOL NUMBER(1,0) Yes 12
REVISION_TYPE2 NUMBER(38,4) Yes 13
REVISION_TYPE1 NUMBER(38,4) Yes 14
FATALITY_LEVEL NUMBER(38,4) Yes 15
FATALITY_PRECISE NUMBER(38,4) Yes 16
HIGHEST_ACTION NUMBER(38,4) Yes 17
HOSTILITY_LEVEL NUMBER(38,4) Yes 18
ORIGINATOR_BOOL NUMBER(1,0) Yes 19

TABLE “RELIGIONS_BY_NATION”

STATE VARCHAR2(4000 BYTE) Yes 1
NAME VARCHAR2(4000 BYTE) Yes 2
DATATYPE VARCHAR2(4000 BYTE) Yes 3
SOURCERELIAB VARCHAR2(4000 BYTE) Yes 4
RECRELIAB VARCHAR2(4000 BYTE) Yes 5
RELIABILEVEL VARCHAR2(4000 BYTE) Yes 6
VERSION VARCHAR2(4000 BYTE) Yes 7
SOURCECODE VARCHAR2(4000 BYTE) Yes 8
YEAR NUMBER(38,4) Yes 9
CHRSTPROT NUMBER(38,4) Yes 10
CHRSTCAT NUMBER(38,4) Yes 11
CHRSTORTH NUMBER(38,4) Yes 12
CHRSTANG NUMBER(38,4) Yes 13
CHRSTOTHR NUMBER(38,4) Yes 14
CHRSTGEN NUMBER(38,4) Yes 15
JUDORTH NUMBER(38,4) Yes 16
JDCONS NUMBER(38,4) Yes 17
JUDREF NUMBER(38,4) Yes 18
JUDOTHR NUMBER(38,4) Yes 19
JUDGEN NUMBER(38,4) Yes 20
ISLMSUN NUMBER(38,4) Yes 21
ISLMSHI NUMBER(38,4) Yes 22
ISLMIBD NUMBER(38,4) Yes 23
ISLMNAT NUMBER(38,4) Yes 24
ISLMALW NUMBER(38,4) Yes 25
ISLMAHM NUMBER(38,4) Yes 26
ISLMOTHR NUMBER(38,4) Yes 27
ISLMGEN NUMBER(38,4) Yes 28
BUDMAH NUMBER(38,4) Yes 29
BUDTHR NUMBER(38,4) Yes 30
BUDOTHR NUMBER(38,4) Yes 31
BUDGEN NUMBER(38,4) Yes 32
ZOROGEN NUMBER(38,4) Yes 33
HINDGEN NUMBER(38,4) Yes 34
SIKHGEN NUMBER(38,4) Yes 35
SHNTGEN NUMBER(38,4) Yes 36
BAHGEN NUMBER(38,4) Yes 37
TAOGEN NUMBER(38,4) Yes 38
JAINGEN NUMBER(38,4) Yes 39
CONFGEN NUMBER(38,4) Yes 40
SYNCGEN NUMBER(38,4) Yes 41
ANMGEN NUMBER(38,4) Yes 42
NONRELIG NUMBER(38,4) Yes 43
OTHRGEN NUMBER(38,4) Yes 44
SUMRELIG NUMBER(38,4) Yes 45
POP NUMBER(38,4) Yes 46
CHRSTPROTPCT NUMBER(38,4) Yes 47
CHRSTCATPCT NUMBER(38,4) Yes 48
CHRSTORTHPCT NUMBER(38,4) Yes 49
CHRSTANGPCT NUMBER(38,4) Yes 50
CHRSTOTHRPCT NUMBER(38,4) Yes 51
CHRSTGENPCT NUMBER(38,4) Yes 52
JUDORTHPCT NUMBER(38,4) Yes 53
JUDCONSPCT NUMBER(38,4) Yes 54
JUDREFPCT NUMBER(38,4) Yes 55
JUDOTHRPCT NUMBER(38,4) Yes 56
JUDGENPCT NUMBER(38,4) Yes 57
ISLMSUNPCT NUMBER(38,4) Yes 58
ISLMSHIPCT NUMBER(38,4) Yes 59
ISLMIBDPCT NUMBER(38,4) Yes 60
ISLMNATPCT NUMBER(38,4) Yes 61
ISLMALWPCT NUMBER(38,4) Yes 62
ISLMAHMPCT NUMBER(38,4) Yes 63
ISLMOTHRPCT NUMBER(38,4) Yes 64
ISLMGENPCT NUMBER(38,4) Yes 65
BUDMAHPCT NUMBER(38,4) Yes 66
BUDTHRPCT NUMBER(38,4) Yes 67
BUDOTHRPCT NUMBER(38,4) Yes 68
BUDGENPCT NUMBER(38,4) Yes 69
ZOROGENPCT NUMBER(38,4) Yes 70
HINDGENPCT NUMBER(38,4) Yes 71
SIKHGENPCT NUMBER(38,4) Yes 72
SHNTGENPCT NUMBER(38,4) Yes 73
BAHGENPCT NUMBER(38,4) Yes 74
TAOGENPCT NUMBER(38,4) Yes 75
JAINGENPCT NUMBER(38,4) Yes 76
CONFGENPCT NUMBER(38,4) Yes 77
SYNCGENPCT NUMBER(38,4) Yes 78
ANMGENPCT NUMBER(38,4) Yes 79
NONRELIGPCT NUMBER(38,4) Yes 80
OTHRGENPCT NUMBER(38,4) Yes 81
SUMRELIGPCT NUMBER(38,4) Yes 82
TOTAL NUMBER(38,4) Yes 83
DUALRELIG NUMBER(38,4) Yes 84

05. Visualizations

<<<<<<< .merge_file_4TZ4Pp

Denominational Dispersions

  • I thought it might be interesting, rather than comparing different religions to each other like so many apples and oranges, to compare different sects and denominations within a religion. To start, I laid out the two largest sects of the three largest religions against the “other” subset of that religion, to see how strongly these religions each coalesced into major denominations, like so:

  • From this you can see that while populations often split comfortably across Protestant or Catholic and others, they either ascribe to one the major Buddhist and Islamic sects entirely or not at all.

  • When I tried paginating these results by year, something very suprising came to light:

=======

Quan’s Stories

A. Start with a green thing in non-aggregated mode and make a boxplot

For Religions by Nation dataset:

  • Put Name into Columns
  • Put Budgen into Rows
  • Choose Boxplot in Show Me

  • Make a new barchart: put Name into Columns, Budgen into Rows
  • Pick out interesting countries (filtering by Name as CHN, DRV, JPN, MYA, THI)
  • Filter by Year greater than or equal to 1955 (to eliminate null values)
  • Put Year in Pages to create a timeline slider
  • Put Name in Marks as Color for labeling

Interesting Findings:

  • China vs Japan: Even though China and Japan has a striking difference in total population, for the early period (1945 – 1970), their Buddhist population was comparable (with Japan’s being little higher). However, from 1970 onwards, Buddhist population in China increased markedly and steadily, outstripping Japan completely.

  • The interesting point about China’s increase is that even though it was rapid, there was a sudden drop (by no less than 50%) between 1995 and 2000, followed by a sudden comeback and plateau right in 2005.

  • For Vietnam, after long period of slow but steady increase in Buddhist population between 1955-1995, there appears a sharp increase and plateau in 2000 (from 6.5M to 40M between 1995 and 2000).

B. Start with two green things and a map

For Religions by Nation dataset:

  • Choose Name’s Geographic Role as Country/Region, then Add to Sheet
  • Filter by Nonreligpct to be at least 15% (eliminating insignificant data)
  • Put Year in Pages to create a timeline slider
  • For Marks, put Name into Color as labeling, and Nonreligpct into Label for showing values

  • Create a new scatterplot: put Year into Columns and Nonreligpct into Rows
  • Put Name into Columns for different facets
  • Pick out interesting countries (filtering by Name as CHN, CUB, CZE, FIN, JAM, PRK, RUS, USA)
  • Put Name into Color in Marks for labeling

Interesting Findings:

  • The behavior of nonreligious percentage in Czechoslovakia, Finland, and USA are considered normal. Thus, they are showned just for comparison.

  • The interesting behaviors are in countries that have an abrupt change in trend, or a sudden increase/decrease in nonreligious percentage of at least 10%.

  • Russia: sudden drop from 63% in 1995 to 12% in 2000.

  • North Korea: 79% in 1995 to 64% in 2000.

  • Jamaica: downward slope from 1960-1980, but a sharp turn upwards to 1990 and then gradual decrease.

  • Cuba: sharp increase from 0% in 1965 to 25% in 1970.

  • China: sharp increase in period 1945-1960.

C. Start with a green thing in aggregated mode and make a histogram

For Militarized Disputes dataset:

  • Create a calculated field named Dispute Length = [End Year] - [Start Year] + 1
  • Choose Histogram in Show Me
  • In Dimensions, choose to edit Dispute Length (bin) and change bin size to 1

Comment: Result shows highest count of dispute length to be for 1 year, and the count decreases significantly as dispute length increases, which is expected and not interesting.

>>>>>>> .merge_file_6CGJqb

1995

2000

06. Shiny App for Interesting Visualizations